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Abstract 

It is increasingly realised that the molecular clock does not tick at a constant rate. Rather, mitochondrial mutation rates are 
influenced by factors such as generation length and body mass. This has implications for the use of genetic data in species 
delimitation. It could be that speciation, as recognised by avian taxonomists, is associated with a certain minimum genetic 
distance between sister taxa, in which case we would predict no difference in the cytochrome b divergence of sister taxa 
according to the species' body size or generation time. Alternatively, if what taxonomists recognise as speciation has tended 
to be associated with the passage of a minimum amount of time since divergence, then there might be less genetic 
divergence between sister taxa with slower mutation rates, namely those that are heavier and/or with longer generation 
times. After excluding non-flying species, we analysed a database of over 600 avian sister species pairs, and found that 
species pairs with longer generation lengths (which tend to be the larger species) showed less cytochrome b divergence. 
This finding cautions against using any simple unitary criterion of genetic divergence to delimit species. 
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Introduction 

In the face of mounting evidence that the rate of molecular 
evolution varies between different lineages, biologists have been 
increasingly obliged to abandon the simple, albeit appealing, idea 
of a molecular clock ticking at a constant rate [1]. As knowledge of 
the specifics of rate variation has grown, so the emphasis has 
shifted towards trying to understand the basis for variation. Several 
factors, which may be linked to one another, for example body 
size, metabolic rate, generation length and population size, have 
been identified as correlates of the rate of molecular evolution [2- 
4]. 

While body size may be correlated with the rate of molecular 
evolution, that correlation does not of itself shed light on the 
underlying cause of the correlation. First, the correlation might be 
mediated via generation length [5]. Species that are smaller tend 
to have shorter generation lengths and thus, per unit time, a higher 
number of DNA replication rounds within the germlrne and, 
consequently, a greater probability of replication errors [4] , [6] . 

Second, the correlation between body size and rate of molecular 
evolution might be mediated via metabolic rate. Among homeo- 
therms, such as birds, whole-body metabolic rate scales with body 
mass to the power of approximately 3 A and mass-specific 
metabolic rate to the power of - % [7-8], but see also ref [9]. 
That scaling could lead to a reduced rate of mitochondrial 
respiration in larger animals and a reduced rate of production of 
mutagenic free radicals, a by-product of aerobic respiration [10- 
11]. Whether such scaling in the somatic cells also affects the 
germ-line cells, from where mutations will be transmitted to the 
next generation, remains unclear [12-13]. 



A separate hypothesis holds that smaller population sizes, which 
are often associated with larger body size [14], may allow more 
slightly deleterious mutations to drift to fixation. This leads to a 
prediction of an inverse relation between effective population size 
and rate of molecular evolution, a prediction for which Woolfit 
and Bromham [15] obtained support from island populations. It 
also leads to the possibility that larger species, whose generally 
smaller populations will be exposed to a greater risk of fixing 
deleterious mutations, may invest more in DNA copy fidelity and 
repair mechanisms [4], [16] - which would enhance the inverse 
relationship between generation time and rate of molecular 
evolution. This enhanced reduction of mutation rate in long- 
lived, as opposed to short-lived, species is the 'longevity hypothesis' 
[13], [16]. Its predictions overlap with those of the generation time 
hypothesis of the previous paragraph. 

In the past, when the biological species concept prevailed [17], 
genetic distance data were not widely used to inform species 
delimitation. However, over the last three decades as, first, amino- 
acid and then nucleic acid sequence data became available, so the 
phylogenetic species concept has risen in prominence [18], not 
least because sequence data are well suited for phylogenetic 
reconstruction [19]. While ornithologists have sometimes struggled 
with how best to integrate molecular data into species diagnosis 
[20-21], such data are frequently used in practice [22-25]. This 
renders it crucial to understand factors impinging on the genetic 
distance between related species. If a suite of factors influences 
mutation rate, then this information should certainly be noted and 
possibly be incorporated when genetic data are used to delimit 
species. 
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In fact, it has long been recognized that there is no uniform level 
of genetic divergence between vertebrate species, whether 
allozyme data [26] and/or DNA sequence data (e.g. from 
cytochrome b [27]) are used. This is to be expected, because the 
times since sister taxa first diverged and then speciated will vary. 
That proviso acknowledged, if speciation tended to be associated 
with a certain minimum genetic distance between sister taxa, then 
we would predict no difference in the cyt b divergence of sister taxa 
according to body size or generation time. Alternatively, if 
speciation tended to be associated with the passage of minimum 
amount of time since divergence, sufficient, for example, for 
reproductive isolation to develop, then there might be less genetic 
divergence between sister taxa that are heavier and/ or with longer 
generation times [28], because, as discussed above, these species 
will have experienced lower rates of molecular evolution at neutral 
marker sites. 

Our paper addresses the possibility of a relationship between the 
cyt b divergence of avian sister taxa, the response variable in the 
analyses, and the average mass, mass difference and generation 
length of those taxa. To our knowledge, this possibility has not 
hitherto been investigated. We chose cyt b as a gene widely used in 
avian barcoding studies [29]. Crucially, we emphasize we are 
taking current species as a given, and asking whether we can 
identify factors that are correlated with the degree of genetic 
difference between those species. 

Methods 

Genetic distance values 

Gene sequences were obtained from Genbank for the 
mitochondrial gene cyt b. All available sequences within a 
particular genus were downloaded into MEGA4 [30], provided 
that at least half of the species within the genus had sequences 
available. The consequences of using a more stringent 75% cut- 
off, as opposed to the 50% cut-off described, are trivial (see 
Results). Sequences within each genus were then aligned using 
ClustalW [31] within MEGA4, and were adjusted by eye. Any 
sequences that did not align, or were much shorter than all others, 
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Figure 2. Scatterplot of the effect of generation time of species 
pairs on the cyt b divergence. There is a significant negative 
relationship (linear model - y = 0.2239-0.0026x, adjusted R 2 = 0.017, 
Fi,622 = H-67, p<0.001). 
doi:10.1371/joumal.pone.0085006.g002 

were removed. The ends of sequences were then trimmed so that 
all were of the same length within the alignment group. 

Once the sequences were aligned, molecular phylogenies were 
created for the various individual genera, using both the maximum 
parsimony and neighbour joining methods, in order to find the 
sister species pairs within that genus, including the cases where a 
given sister species pair was already known. In many cases, 
published phylogenies supported the pairings. Species involved in 
unresolved polytomies were discarded. These genus-level phylog- 
enies were not used in the wider phylogenetic analysis discussed 
below and reported in the Results. 

After sister species pairs were found, the genetic distance 
between them was calculated, using the Tamura-Nei model [32]. 
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Figure 1 . Scatterplot of the effect of the percentage difference 
in species' mass in each pair ( = (absolute difference in mass/ 
mass of the heavier species)*100) on the cyt b divergence. 

There is a significant positive relationship (linear model - 
y = 0.1948+0.00075x, adjusted R 2 = 0.015, F 1 548 = 9.29, p = 0.002). 
doi:1 0.1 371 /journal.pone.0085006.g001 



Figure 3. Scatterplot of the effect of the average mass of a 
species pair on the cyt b divergence, excluding all flightless 
species. There is a significant negative correlation between the 
variables (linear model - y = 0.21 18 - 0.00001 1x, adjusted R 2 = 0.017, 
^1,622= 12.00, p<0.001). 
doi:10.1371/journal.pone.0085006.g003 
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The Tamura-Nei model accounts for unequal nucleotide frequen- 
cies, and different rates of nucleotide transitions and transversions. 
Where multiple sequences were present for either species, the 
average value of divergence between the species was calculated. It 
is worth noting that, as our comparisons involved sister species, 
saturation of sites is unlikely to be a serious problem. The full 
dataset is presented in the Table SI. 

Mass data 

Data on species masses were obtained from [33]. Where a mass 
range rather than mean value was given for a species, the median 
of this range was taken. The mean value for each species was 
calculated if more than one value was given (i.e. separate masses 
provided for males and females). For analyses of average body 
mass, a single value was needed for each pair, rather than separate 
values for each species, so the mean value of the masses was 
calculated. Where mass data were only available for one of the 
species pair, then that value was taken to represent both species. 
However, for analyses involving differences in species' mass, we 
could only use those pairs where mass data were available for both 
species. The difference in mass was calculated and then expressed 
as a percentage of the mass of the heavier member of the pair. 

Non-flying species, ratites and penguins, were excluded from the 
mass and generation time analysis since determinants of these 
species' masses are likely to be different to those of volant birds. 

Generation length 

Species generation lengths were obtained from a database 
supplied by BirdLife International (http://www.birdlife.org/ 
datazone/home) which, following [34], defines generation length 
as the average age of the parents of the current cohort. 

Statistical analysis 

Statistical analyses were carried out in R [35] and R 
commander [36]. Initial histograms of the cyt b divergences 
revealed that the data were positively skewed, so the Shapiro-Wilk 
test for normality [37] was carried out. The data were not 
normally distributed, and so the square roots of the Tamura-Nei 
values were used for subsequent analyses. 

After initial investigation using ordinary least-squares (OLS) 
models, we analysed the effects of mass and generation length on 
divergence using phylogenetic generalized least-squares models 
(PGLS), using the pglmEstLambda function in CAICR [38] and 
ape 2.3 packages [39] in R. This method gives an estimate of the 
strength of the phylogenetic signal within the data (the X value: 
[40]) as well as estimating the effect of a given factor when the 
phylogeny is incorporated into the model. As with the linear 
analyses, the PGLS analysis used the square root of the Tamura- 
Nei distance for cyt b divergence to improve normality of residuals. 

We fitted all possible combinations of our three predictor 
variables and their interactions, comparing models using Akaike's 
Information Criterion (AIC; all fitted models are shown in 
Table 1). Following Burnham & Anderson [41] we accepted as 
"top models" all those within a threshold of 2 AIC points of the 
minimum within the model set. 

A phylogeny of bird species, based on genetic, behavioural and 
morphological data, was obtained from the Tree of Life [42] . The 
tree did not contain all the genera within the dataset, so the dataset 
was trimmed to contain only those genera that were available 
within the tree, and likewise the tree was trimmed to only those 
genera found within the dataset. Hence the dataset used in 
phylogenetic analyses was a subset of the original. Branch lengths 
on the phylogeny were unknown and were therefore arbitrarily set 
to 1. 



Results 

The mean Tamura-Nei divergence of cyt b between sister 
species pairs was 0.0491 ±s.d. 0.0334 (5th-95th percen- 
tiles — 0.005-0.109: « = 633 species pairs; see Table SI). If the 
analysis is restricted to the smaller sample of those genera where 
sequence data were available for at least 75% of species within the 
genus (see Methods), then the mean Tamura-Nei distance barely 
changes (mean = 0.0492±0.0336: n = 512). For this reason, all 
results presented henceforth are based on the less stringent 50% 
cut-off outlined in the Methods. 

All models and their associated coefficients are given in Table 1 . 
There was a single clear top model, which included only two 
predictor variables: percentage mass difference and generation 
length (AAIC to the next best model = 2.72). Phylogenetic signal 
was strong in this model (X = 0.375, test of X versus 0, % 2 = 19.286, 
p= 1.125X10 -5 ). According to this model, an increase of 10 
percent in the mass difference between the two members of a 
species pair was associated with an increase of approximately 
0.009 in their square-root transformed cyt b divergence (Fig. 1; 
dropping "mass difference" from top model, PGLS, -P1274 = 8.892, 
p = 0.003). In contrast, an increase of one year in the average 
generation length of a species pair was associated with a decrease 
of 0.004 in the square-root transformed cyt b divergence between 
the two species (Fig. 2; dropping "generation length" from top 
model, PGLS, F 1;274 = 6.99 7, p = 0.009). 

Superficially, there was also what appeared to be a negative 
relationship between the average mass of a volant species pair and 
the cyt b divergence of the two species, even though this variable 
did not appear in the top model. Although this relationship was 
significant in a single-factor analysis under OLS regression (Fig. 3), 
incorporating phylogenetic information gready reduced its signif- 
icance (PGLS; F li275 = 4.3 39, p = 0.038, X = 0.41 7, test of X versus 
0, % 2 = 14.749, p= 0.0001), consistent with a general pattern of 
strong phylogenetic conservatism in body mass [43]. When the 
other important predictor variables were incorporated, the 
explanatory power of body mass became non-significant (dropping 
"body mass" from the "M+D+G" model, PGLS, F i:m = 3.257, 
p = 0.072). 

Taken together, we interpret these results as supporting the idea 
that generation length and percentage mass difference have 
independent effects on cyt b difference, but mass per se does not, at 
least when phylogeny is accounted for. 

Discussion 

Our principal finding was that the cyt b divergence between 
avian sister taxa decreased as generation length increased. 
Divergence also decreased as body mass increased, but there is 
no strong and conclusive evidence that this latter effect was either 
independent of generation length, to which mass is collinearly 
related [28], or robust to phylogenetic correction. Thus while 
speciation is associated with the accumulation of some minimal 
amount of genetic divergence in neutral markers such as cyt b, 
other factors bear on the genetic divergence between recognised 
species and must be borne in mind when genetic data are used for 
species diagnosis [22-25]. Therefore, our findings caution against 
the simple expectation that there might be a uniform genetic 
divergence between sister species, particularly when comparison is 
made between species pairs drawn from different lineages. 

Cyt b divergence was positively associated with the percentage 
difference in body size of species (Fig. 1). This association was to be 
expected, as species that have diverged proportionately more in 
body size are likely to have speciated in the more distant past, 
accumulating a greater number of changes in cyt b sequence. That 
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would remain true whether the molecular clock 'ticks' at a rate 
proportional to absolute time, generation time or metabolic rate. 

While percentage difference in mass between sister taxa is 
related to their cyt b divergence, mass per se has limited effect. This 
suggests that mass-specific metabolic rate, allometrically related to 
mass, has limited effect on substitution rates and divergence [10], 
[11], a negative result supporting those previously obtained [6], 
[12]. However, our conclusion must be cautious as we did not 
analyse actual metabolic rates. 

There was a decrease in the divergence between species as their 
generation length increased (Fig. 2). We therefore tentatively 
suggest that speciation is typically recognised by taxonomists after 
the passage of a certain amount of time representing, on average, 
fewer generations in larger species, and more generations in 
smaller species. Those species with shorter generation lengths may 
have about the same proportion of DNA changes per generation 
as those with longer generations, and therefore more changes will 
accumulate per year in the species with shorter generation times 
[10]. However the amount of variance explained was very low, 
about two percent (Fig. 2, legend). Superior DNA repair in the 
longer-lived species (see Introduction) could contribute to the low 
amount of variance explained, as could the fact that the gene used 
in this study, cyt b, was mitochondrial. Such mitochondrial genes 
may show a weaker correlation with generation time than nuclear 
DNA, as there is a larger and more variable number of 
duplications of mitochondrial genomes per generation than of 
nuclear genomes [44—45]. 

Lanfear et al. [46] reported that rates of molecular evolution of 
protein-coding nuclear genes were positively correlated with rates 
of diversification in various avian lineages. If mutation promotes 
speciation, then it is tempting to conclude that the difference 
between the cyt b of sister species might be lower in fast-mutating 
lineages of smaller taxa. However that need not be the case. If a 
certain minimum amount of time is required to elapse before 
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diverging lineages are recognised as species by taxonomists, then 
the smaller species pairs with shorter generation times might show 
higher divergence, as we find. 

Implicit in our study is an assumption that the scientific 
behaviour of avian taxonomists is similar across the range of bird 
weights. Our results could be explained if cryptic species, awaiting 
'splitting', were disproportionately represented among smaller 
species of low body mass and short generation time. Objectively 
excluding this possibility would be extremely taxing and we merely 
observe that avian taxonomy at the species level remains in a state 
of flux from the smallest Phylloscopus warblers (< 10 g: [47]) to large 
albatrosses (c. 5 kg; [22], [48]). 

The fact that a signal indicating effects of body mass and 
generation length on the extent of genetic divergence between 
sister taxa can be recovered at all is perhaps remarkable. It 
confirms speciation as an ongoing biological process that continues 
to create problems for species delimitation. 
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